Mist: Efficient Dissemination of Erasure-coded Data in Data Centers

Authors

  • Jun Li
  • Baochun Li
  • Bo Li
Abstract

Data centers store massive amounts of data on large numbers of servers built from commodity hardware. To maintain data integrity against server failures, erasure codes have been extensively deployed in modern data centers, providing a higher level of failure tolerance with less storage overhead than replication. Yet, compared to replication, disseminating erasure-coded data from a source server to multiple servers takes significantly more time. In this paper, we design and implement Mist, a new mechanism for disseminating erasure-coded data efficiently to multiple receiving servers (receivers) in data centers. Mist speeds up the dissemination process by building an efficient topology among receivers with heterogeneous performance, so that coded data can be received from other receivers in a pipelined fashion rather than directly from the source. Mist flexibly supports a wide range of erasure codes without imposing constraints on the range of system parameters, and it can be extended to achieve better performance for specific erasure codes by taking advantage of their properties. We have implemented Mist in Python, and our experimental results on Amazon EC2 demonstrate that the dissemination time can be reduced by up to 96.3% with different kinds of erasure codes.
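
The trade-off stated in the abstract can be illustrated with a little arithmetic. The sketch below is not Mist and does not model its topology construction. It is a toy comparison that assumes an MDS code such as Reed-Solomon with parameters (n, k), a source whose uplink is the only bottleneck, and replicas that forward copies along a chain so the source uploads the file once. The function names, the file size, and the bandwidth figure are all hypothetical.

```python
def erasure_profile(n, k, file_mb, source_mb_per_s):
    """Toy cost profile of an (n, k) MDS erasure code (assumption: MDS).

    The file is split into k blocks and encoded into n blocks of the same
    size; any k of the n blocks suffice to recover the file.
    """
    block_mb = file_mb / k
    return {
        # Stored bytes divided by original bytes.
        "storage_overhead": n / k,
        # An MDS code survives the loss of any n - k blocks.
        "failures_tolerated": n - k,
        # Assumption: the source itself uploads all n coded blocks.
        "source_upload_s": n * block_mb / source_mb_per_s,
    }


def replication_profile(copies, file_mb, source_mb_per_s):
    """Toy cost profile of plain replication with `copies` full copies."""
    return {
        "storage_overhead": float(copies),
        "failures_tolerated": copies - 1,
        # Assumption: replicas forward the data along a chain, so the
        # source uploads the file only once.
        "source_upload_s": file_mb / source_mb_per_s,
    }


if __name__ == "__main__":
    # Hypothetical numbers: a 600 MB file and a 100 MB/s source uplink.
    print("erasure (9, 6):", erasure_profile(9, 6, 600, 100))
    print("3 replicas:    ", replication_profile(3, 600, 100))
```

Under these assumptions the (9, 6) code stores 1.5 times the original data and tolerates three failures, while three replicas store 3 times the data and tolerate two. The source, however, uploads for 9 s in the erasure-coded case against 6 s for replication, which is the kind of gap that letting receivers obtain coded data from one another is meant to close.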

Similar resources

Data Insertion and Archiving in Erasure-Coding Based Large-Scale Storage Systems

Given the vast volume of data that needs to be stored reliably, many data-centers and large-scale file systems have started using erasure codes to achieve reliable storage while keeping the storage overhead low. This has invigorated the research on erasure codes tailor made to achieve different desirable storage system properties such as efficient redundancy replenishment mechanisms, resilience...

On repairing erasure coded data in an active-passive mixed storage network

Citation Oggier, F., & Datta, A. (2015). On repairing erasure coded data in an active-passive mixed storage network. International journal on information and coding theory, 3(1). Abstract: A major change has been recently witnessed in networked distributed storage systems (NDSS), with increased use of erasure codes in lieu of replication for realizing data redundancy. Yet, both the industry and...

A Solution to the Network Challenges of Data Recovery in Erasure-coded Distributed Storage Systems: A Study on the Facebook Warehouse Cluster

Erasure codes, such as Reed-Solomon (RS) codes, are being increasingly employed in data centers to combat the cost of reliably storing large amounts of data. Although these codes provide optimal storage efficiency, they require significantly high network and disk usage during recovery of missing data. In this paper, we first present a study on the impact of recovery operations of erasure-coded ...

Erasure Coding in Distributed Storage Systems

Data centers are nowadays highly distributed, possibly over continents, to cope with the huge amounts of data collected these days. To deal with data loss in an environment where hardware failures are very common, researchers have developed new kinds of erasure codes. Erasure coding is a common technique used in all sorts of digital communication, including deep space communication and QR-codes...

A Reed-Solomon Code for Disk Storage, and Efficient Recovery Computations for Erasure-Coded Disk Storage

Reed-Solomon erasure codes provide efficient simple techniques for redundantly encoding information so that the failure of a few disks in a disk array doesn’t compromise the availability of data. This paper presents a technique for constructing a code that can correct up to three errors with a simple, regular encoding, which admits very efficient matrix inversions. It also presents new techniqu...

Publication date: 2018